Mutational scans reveal differential evolvability of Drosophila promoters and enhancers

Rapid enhancer and slow promoter evolution have been demonstrated through comparative genomics. However, it is not clear how this information is encoded genetically and if this can be used to place evolution in a predictive context. Part of the challenge is that our understanding of the potential for regulatory evolution is biased primarily toward natural variation or limited experimental perturbations. Here, to explore the evolutionary capacity of promoter variation, we surveyed an unbiased mutation library for three promoters in Drosophila melanogaster. We found that mutations in promoters had limited to no effect on spatial patterns of gene expression. Compared to developmental enhancers, promoters are more robust to mutations and have more access to mutations that can increase gene expression, suggesting that their low activity might be a result of selection. Consistent with these observations, increasing the promoter activity at the endogenous locus of shavenbaby led to increased transcription yet limited phenotypic changes. Taken together, developmental promoters may encode robust transcriptional outputs allowing evolvability through the integration of diverse developmental enhancers. This article is part of the theme issue ‘Interdisciplinary approaches to predicting evolutionary biology’.

and populations. Recently, mutational scans have been applied to developmental enhancers in fruit flies, where it was found that almost all mutations altered gene expression [17]. This study was further extended to additional elements [18], suggesting that developmental enhancers are often sensitive to perturbation, and may be highly constrained.
Metazoan promoters are traditionally thought to be functionally separated from enhancers, with the former primarily interacting with the transcription machinery (e.g. Pol II) and the latter interacting with transcription factors carrying spatial and temporal information. However, recent studies suggest that the boundary between promoters and enhancers can be blurry: enhancers can initiate certain levels of transcription, just like promoters, and many known promoters can influence transcription initiation of other genes, which is the classical definition of enhancers [19][20][21]. From an evolutionary standpoint, it has been found that rapid evolution of enhancers is a general feature of mammalian genomes [22]. By contrast, the genomic enrichment of key histone marks H3K27 acetylation and H3K4 trimethylation associated with promoters was partially or fully conserved across these species, suggesting that there is slow evolution of promoters in animal genomes. However, it is not known how this information is encoded genetically at developmental promoters, which have been distinguished from 'housekeeping' promoters by their distinct epigenetic and sequence signatures [23], level of Pol II stalling [24] and enhancer preferences [25].
Here, to understand if mutations in developmental promoters have a different distribution of effects on gene expression from those in enhancers, we examined random mutation libraries of three Drosophila promoters and compared them to a previously surveyed Drosophila E3N enhancer. By contrast with the previous findings that enhancers may be highly sensitive to mutations [17], we found that mutations in these promoters sometimes change the level of gene expression, but never the spatial pattern of expression. Together, these findings suggest that developmental promoters may encode robust transcriptional outputs allowing evolvability through the integration of diverse developmental enhancers.

Results
We focused our analyses on the regulatory sequences of shavenbaby (svb), a gene that encodes an essential regulator of trichome development in Drosophila. The evolution of the svb regulatory regions has been extensively studied due to contributions to phenotypic evolution across many Drosophila species [26][27][28][29]. Through these works, seven transcriptional enhancers have been characterized for svb; each integrates information from multiple patterning networks giving rise to the overall expression of svb across the embryo (figure 1a).
To explore how the native svb promoter (svbp) integrates these diverse activities, we tested the activity of the svbp using integrated reporter gene assays. The svbp shows high regulatory activity based on ReMap density [31] and is conserved among Drosophila species [32] (figure 1a). It does not contain a TATA-box or other strong transcription motifs, consistent with signatures of developmentally regulated promoters in Drosophila [23]. Deletion of svbp resulted in severe depletion of ventral trichomes (figure 1b,c), recapitulating svb mutant phenotypes [33], although these lines were homozygous viable. In order to understand how different promoters control levels and patterns of transcription activities driven by developmental enhancers, we generated reporter constructs containing either the svb promoter or the Drosophila synthetic core promoter (DSCP), an artificially engineered promoter known to drive high levels of expression [34]. Both promoters were placed downstream of the E3N enhancer of svb, which drives expression in a pattern of eight stripes in the abdominal region (A1 to A8) in stage 15 embryos [17]. The design with the E3N enhancer was necessary because, in the absence of an enhancer, the construct (with hsp70 promoter) only drove a low level of background expression in stage 15 embryos [18]. We found that the two promoters drove different levels of reporter gene expression in the stripes, using the second abdominal stripe (A2) as a focal region for quantification (figure 1d, electronic supplementary material, figure S1). The nuclear intensity from DSCP was on average 2.4-fold higher than that of svbp. However, we found no differences in the overall gene expression patterns in the stripes (figure 1e,f). Additionally, we tested constructs without a promoter and with a random sequence as the promoter. We found that the reporter expression was below the level of detection in both lines (electronic supplementary material, figure S2), ruling out any promoter activity contributed by E3N in the constructs.
In order to understand the evolutionary potential of promoters in regulating the level and pattern of gene expression in a developmental context, we generated random mutation libraries of svbp and DSCP at a mutation rate of 1-2%, in a similar manner to our previous mutational scan on the E3N enhancer [17]. In the previous study, which examined the mutational profile of E3N in combination with the hsp70 promoter, single-point mutations in E3N almost always decreased the gene expression level (18/18 variants). They often changed the state or locations of gene expression (11/18, 61%). In the follow-up studies that examined E3N variants with 1-10 mutations, it was found that the majority of variants caused decreases in the level of gene expression (83/91 based on median intensity, 91%) [18] and reduction in the number of nuclei in A1 to A8 stripes (81/91, 89%) [35]. We independently recapitulated these results by analysing ten randomly selected lines with 2-3 mutations from the E3N library. We found that all ten lines had a reduced number of nuclei expressing lacZ (figure 2a-c, electronic supplementary material, figure S3) and that seven showed reduced levels of expression (false discovery rate (FDR)-adjusted p < 0.05, Wilcoxon test, figure 2a). Together, these results are consistent with the previous finding that enhancers encode dense spatial information [17,18,36].
By contrast, we analysed 28 svb promoter variants (figure 2d-h), together covering 58 base pairs, and did not find any variants changing the pattern of gene expression (representative images in figure 2f-h). Unlike the enhancer, only two variants showed significantly lower expression than the wild-type svb promoter (figure 2d). Furthermore, one line showed higher expression levels than the wild-type promoter. The three lines with a changed expression level (89-6, 88-13 and 9-4) together contained seven unique mutations, and three were at highly conserved positions at the proximal end of the promoter (electronic supplementary material, figure S4A), suggesting potential functional relevance. However, we did not find a significant correlation between the presence of phenotypic effect and the level of conservation (phyloP score from 124 insects [32]), possibly due to the small sample size (Wilcoxon test, p > 0.05). Interestingly, line 9-4 harboured a single T-to-A mutation at the first nucleotide position of the svb 5′ UTR, which showed the most severe reduction in the level of gene expression in the library (figure 2d), suggesting a potentially critical role of transcription start sites. Taken together, given that the promoter library had a comparable mutation rate to the E3N enhancer library (on average 1%), our results suggest that developmental promoters are more robust than enhancers when subjected to the same mutation load.
royalsocietypublishing.org/journal/rstb Phil. Trans. R. Soc. B 378: 20220054
We next extended our analysis to the DSCP. The DSCP was created by adding an initiator (Inr) motif, a motif ten element (MTE) and a downstream promoter element (DPE) to a TATA-containing promoter of the developmental gene even skipped (eve), creating one of the strongest promoters available in fruit flies [34]. We quantitatively analysed 45 variants of DSCP, with the number of mutations ranging from 1 to 8 across the 255 bp-long sequence and an average mutation rate of 1.5% (figure 3). There were 117 nucleotide positions mutated in total, and 9 of them fell in the four functional motifs mentioned above (shaded regions in the left panel of figure 3a). We found that mutations in DSCP changed the expression level of the reporter gene more often than those in svbp, with 13 mutant lines showing significant changes, suggesting that the endogenous svb promoter might be more robust than the synthetic promoter. Among the variants showing changes in expression, 7-2 had a mutation in TATA, 17-2 had a mutation in Inr, and lines 13-11 and 49-1 both had mutations in MTE. However, we did not find a statistically significant enrichment of mutations in these transcriptional motifs in the lines that showed changes in expression versus ones that did not (Fisher's exact test, p > 0.05). Furthermore, there was no correlation between the presence of an effect on gene expression and the level of conservation (for regions from the eve promoter, see electronic supplementary material, figure S4B, Wilcoxon test, p > 0.05).
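As an illustration of the kind of enrichment test mentioned above, the following Python sketch computes a two-sided Fisher's exact test for a 2×2 table using only the standard library. The function name `fisher_exact_two_sided` and the counts in the usage example are assumptions made for illustration; the counts are placeholders and are not the data of this study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: lines with / without a change in expression.
    Columns: lines with / without a mutation in a core motif.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x motif-mutated lines in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Placeholder counts for illustration only (not the study's data).
p_value = fisher_exact_two_sided(4, 9, 5, 27)
print(f"p = {p_value:.3f}")
```

The exact test is a natural choice here because the number of lines per cell is small, where a chi-squared approximation would be unreliable.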
Interestingly, although DSCP drove a high level of expression, mutations in DSCP increased its activity even further in 10 mutant lines. This suggests that developmental promoters such as those of svb and eve might have the evolutionary potential to drive higher expression. However, it remains to be tested if strong transcriptional motifs such as those artificially engineered into DSCP are prerequisites of such evolvability. Due to the multiple mutational paths that led to higher levels of expression (either through transcriptional motifs or point mutations), developmental promoters [...]
Consistent with our findings from the svbp, mutations in the DSCP did not change gene expression patterns, supported by the 45 lines quantified above (e.g. figure 3b-d) and 21 additional DSCP variants examined qualitatively (electronic supplementary material, figure S5A-C). To further validate these results, we generated a mutation library of the hsp70 promoter (hsp70p), a promoter commonly used to drive constitutive expression in Drosophila and used in the E3N library. Similarly, we did not find any variants causing a change in the expression pattern in the 31 variants examined (covering 74 out of 268 bp; electronic supplementary material, figure S5D-F). Together, these results are consistent with the traditional view of promoters encoding little spatial information [37].
Although reporter constructs allowed us to examine the promoter variants in a controlled setting, it remains unknown whether the changes in transcription can lead to phenotypic outcomes at the endogenous locus, where complex promoter-enhancer interactions are involved. Therefore, we next tested if a change in the promoter activity at the endogenous locus could lead to phenotypic outcomes. We knocked out the svb promoter at its endogenous locus and replaced it with DSCP using CRISPR/Cas9. We found that the stronger DSCP promoter produced higher levels of transcription based on the local levels of nascent svb transcription compared to the endogenous promoter (figure 4a-c), consistent with the finding from reporter constructs. However, the changes in transcription levels did not directly translate into morphological changes, i.e. the pattern of ventral trichomes in larval cuticles (figure 4d,e): the DSCP knock-in rescued the knockout phenotype (severe depletion of trichomes, figure 1) to the wild-type level, but there were no apparent differences in the trichome patterns from the wild-type.

Discussion
Although it remains debatable whether enhancers and promoters are functionally different elements from a transcription perspective [19,20], there is evidence that they are under different selective pressures and possible evolutionary constraints. For example, comparative studies have shown that enhancer sequences undergo rapid sequence divergence while maintaining their regulatory functions via binding site turnover, consistent with stabilizing selection [38,39]. Gains and losses of enhancers were also found to be frequent in different lineages [22,39]. Promoters have been shown to exhibit higher levels of sequence divergence than surrounding regions in insects, possibly associated with an increased mutation rate [40]. Changes in promoters tend to be neutral [41], consistent with our findings. They have also been shown to evolve more slowly than enhancers in mammals [22]. Still, the level of constraint on promoters can differ among different types of promoters [42], with highly constrained promoters associated with developmental functions [43]. Empirical characterization of the mutational space of enhancers and promoters in a developmental context was only made possible recently through mutational scans [17] and automation of embryo handling [44]. Recent mutational scans have found that developmental enhancers encode dense regulatory information and are strongly constrained [17,18,36]. In this study, we found that Drosophila promoters have different mutational profiles from enhancers. At a comparable mutation rate to the previously published E3N enhancer library [17], variants in our promoter libraries did not show any changes in the pattern of gene expression (figures 2 and 3, electronic supplementary material, figure S5). By contrast, almost all mutant lines of E3N changed the pattern (figure 2, electronic supplementary material, figure S3). Mutations in promoters can change the level of expression in either direction (figures 2 and 3), whereas mutations in enhancers tended to reduce expression (figure 2) [17,18].
Together, these findings suggest that Drosophila promoters might be more robust to mutations than enhancers. Interestingly, this difference seems to exist in yeast promoters, if one considers a yeast promoter to be a mixture of enhancer (binding transcription factors) and promoter (initiating transcription) sequences: in the study of TDH3 promoter, mutations in transcription factor binding sites (enhancer) greatly reduced transcription whereas other mutations only fine-tuned the level of expression [15]. Additionally, our results indicated that promoters might have little potential to evolve new spatial patterns of expression, consistent with a previous finding that promoters were less likely to be repurposed as enhancers than the other way around in mammalian evolution [45]. However, this observation remains to be tested with more promoters and beyond the context of reporter constructs. Furthermore, the effects of promoter variants on the level of gene expression did not correlate with the number or the location (e.g. in TATA or other motifs) of mutations (figures 2 and 3), suggesting that regulatory information might be randomly distributed in these promoters and a saturated mutational scan might be required to fully decode the regulatory potential of promoter sequences.
When comparing svb promoter and DSCP, it is clear that the endogenous svb promoter had low activity (figure 1), consistent with previous views [19]. The fact that both svbp and DSCP had access to mutations that can increase the expression level (figures 2 and 3) suggested that the low activity of endogenous promoters might be a result of selection. Furthermore, the high-activity, artificially engineered promoter was more 'evolvable' (or 'breakable') in the sense that many mutations led to changes in the level of gene expression, whereas the low-activity, endogenous svb promoter was relatively robust to mutations, suggesting that developmental promoters might have evolved to encode robust transcriptional outputs. This robustness may facilitate evolvability through the rapid integration of developmental [...]
The relationship between the effects of cis-regulatory mutations on transcription and on fitness is often nonlinear [15,46]. In our study, we found that deletion of svb promoter led to a severe reduction of trichomes, similar to svb knockout phenotypes [47]. However, changes in promoter activity at the endogenous locus of svb by knocking-in DSCP did not cause a change in larval trichome patterns (figure 4), suggesting that changes in transcription level could be buffered by the downstream network [2], consistent with developmental traits being highly robust systems [48,49]. Alternatively, the nonlinear relationship could be explained by a threshold model, where the downstream patterning is elicited when svb transcription is above a certain threshold and a higher level of transcription does not change the patterning outcome [47].
Although promoters seem to be more robust to mutations than enhancers, the svb promoter shows a high level of sequence conservation, suggesting a certain degree of constraint. There could be a few explanations for this seemingly conflicted observation. First, promoters might be more 'essential' to transcription than enhancers because transcription is usually closely associated with one promoter but possibly multiple enhancers with redundant roles. Perturbation of promoters at their endogenous loci often has large phenotypic effects [50,51], whereas perturbation of redundant enhancers may only manifest its effects under challenging conditions [52][53][54]. Future mutational scans of promoters and enhancers at the endogenous locus that focus on fitness effects are expected to provide insights in this direction. Second, different enhancers might interact with different sequence motifs in the promoter [55] that constrain promoter sequences, but that were not explored in the current study. Examination of a combinatorial promoter-enhancer mutation library may be required to address this possibility.
Together, our study and previous ones [17,18] highlight the power of mutational scans in providing insights for developmental evolution. This approach allows us to fully explore 'the possible and the actual' [56] of cis-regulatory evolution, which is currently lacking, especially in a developmental context. The differential constraints observed in different cis-regulatory elements can help us predict where evolutionarily relevant substitutions could occur within a locus. They also support the previous findings that the evolution of svb consists of multiple small-effect substitutions throughout the locus in different Drosophila species [27,29]. In the future, mutational scans by allele replacement at the endogenous loci will provide further insights into the fitness landscape of regulatory elements in a developmental context, paralleling those in microorganisms [10,15] and cell lines [57].

Random mutation libraries of Drosophila synthetic core promoter (DSCP), hsp70 and svb promoters were synthesized at Genscript with a mutation rate of 10-20 point mutations per kb. In particular, the DSCP sequence (155 bp) was flanked by 50 bp-long sequences from hsp70p at each end, and the svbp sequence (226 bp) was flanked by 19 and 20 bp-long sequences from hsp70p at each end, respectively. The flanking sequences were also subjected to mutagenesis. The variants were cloned into E3N-placZattB [17] to replace the wild-type hsp70 promoter, which was positioned downstream of an E3N enhancer and upstream of a lacZ reporter [17]. The libraries were integrated into the fly genome at the attP2 site, with the injection service provided by GenetiVision. G0 transformants were crossed to w1118, and their offspring (F1) were screened for the presence of the construct by eye colour. The red-eye F1 flies were individually crossed to w1118 to establish isogenic lines, which were subsequently homozygosed by sibling crosses. The mutant lines were then sequenced to identify mutations in the promoters, with primer 5′-CCAAGTTGGTGGAGTTCATAATTCC-3′ or 5′-AGGCATTGGGTGTGAGTTCTTC-3′. The sequences are listed in electronic supplementary material, table S1.
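Identifying point mutations from sequenced lines amounts to comparing each variant with the wild-type sequence. The Python sketch below shows one minimal way to do this for already-aligned sequences of equal length, and converts the count into mutations per kb for comparison with the library's design rate. The function names and the toy sequences are hypothetical and are not taken from the libraries described here; insertions and deletions are not handled.

```python
def call_point_mutations(wild_type, variant):
    """List substitutions between two aligned, equal-length sequences.

    Returns (position, wild-type base, variant base) tuples, 1-indexed.
    This sketch assumes substitutions only (no indels).
    """
    if len(wild_type) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    wt, var = wild_type.upper(), variant.upper()
    return [(i + 1, w, v) for i, (w, v) in enumerate(zip(wt, var)) if w != v]

def mutations_per_kb(n_mutations, length_bp):
    """Convert a mutation count into point mutations per kilobase."""
    return 1000.0 * n_mutations / length_bp

# Toy sequences for illustration only.
wt = "ACGTACGTACGTACGTACGT"
mut = "ACGTACCTACGTACGTAAGT"
calls = call_point_mutations(wt, mut)
print(calls)
print(mutations_per_kb(len(calls), len(wt)))
```

For real Sanger reads, an alignment step would be needed before this comparison.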
Additionally, we used two negative controls: 1) a construct with the E3N enhancer and LacZ but without any promoter; and 2) a construct with the E3N enhancer, a 200-bp-long 'inert' spacer sequence that was computationally screened for lack of transcription factor motifs in the place of the promoter (GAAGTTTCGACTAGTCTGAAACTTCTACACAGACCGTATTAGAACTAT TACTAGCTACAAGCTCCTAGTGCTTTGAAAGCTATAACCTTAAGATGC TGTTAGTATCTCAACCGACTTACTGCAGAGACTTGACGAATTCTGAAA GTTCAGAACTAGTCTCTGAGTTGCGAGGTACATTTAGCAATGTAAGAA CCTCGGCT), and LacZ. The control constructs were integrated at attP2 sites as described above.

(b) Embryo collection and immunostaining
Embryos were collected from an overnight laying period at 25°C and fixed using a standard protocol [18]. During fixation and staining, a wild-type promoter control was included in each batch to account for batch effects.
Expression of lacZ was detected with a chicken anti-βGal antibody (1 : 500, Abcam ab9361). ELAV was stained with mouse anti-ELAV supernatant (1 : 25, Developmental Studies Hybridoma Bank Elav-9F8A9) as a fiducial marker for the automated imaging pipeline to rotate the images, as well as for the experimenter to visually stage the embryos. For DSCP, E3N and hsp70p libraries, as well as for comparing DSCP and svbp activity (data in figures 1 and 3, electronic supplementary material, figures S3 and S5), Alexa Fluor 488 and 633 (1 : 500) were used as secondary antibodies for βGal and ELAV, respectively. Due to the extremely weak signal of svbp lines, we used extra staining steps for the svbp mutation library to enhance the signal (data in figure 2). After a secondary incubation of Alexa Fluor 555/488 (goat anti-chicken, 1 : 500), Biotin conjugate was used as tertiary antibody (donkey anti-sheep, 1 : 500, 1 h incubation) and NeutrAvidin 550 was used for quaternary staining (1 : 500, 30 min to 1 h incubation). Alexa Fluor 488/647 (1 : 500) was used as the secondary antibody for ELAV in this case.
The stained embryos of DSCP and svbp libraries were mounted in ProLong Gold with DAPI. A subset of DSCP lines and all of the hsp70p lines were mounted in benzyl alcohol/benzyl benzoate (BABB) [44] and were analysed qualitatively due to lower imaging quality. The mutation libraries were imaged on a Zeiss LSM 880 confocal microscope with an automated pipeline under a 20× objective (air, 0.8 NA) as previously described [44] or manually under the same setting. Embryos used for comparison between DSCP and svb promoter in figure 1 were imaged manually under a 25× (oil, 0.8 NA) objective.

(c) Quantification of lacZ expression
We focused on cells in the second abdominal stripe (A2) in stage 15 embryos for analysing the pattern and intensity of lacZ expression. In each embryo, the A2 region was manually selected, max-projected, and background-subtracted with a rolling ball radius of 50 pixels. To select for LacZ-expressing cells in the region, we first performed a Gaussian blur with a radius of 2 pixels to remove noise, and then identified regions of interest (ROIs) by automatically thresholding the image with the Otsu method in ImageJ (electronic supplementary material, figure S1A). The ROIs were applied to the background-subtracted image and analysed with the Analyze Particles function to extract the mean intensity of each ROI. Mean intensity per embryo was calculated by
$$\frac{\sum_R i_R A_R}{\sum_R A_R},$$
where $i_R$ was the mean intensity and $A_R$ was the area of ROI $R$.
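The per-embryo averaging step described above can be sketched in a few lines of Python. The sketch assumes that the per-embryo value is the area-weighted mean of the ROI mean intensities, which is the natural reading of the two quantities defined in the text (ROI mean intensity and ROI area) but is an assumption here. The function name and the example measurements are hypothetical.

```python
def embryo_mean_intensity(rois):
    """Area-weighted mean intensity over (mean_intensity, area) ROI pairs."""
    total_area = sum(area for _, area in rois)
    if total_area <= 0:
        raise ValueError("total ROI area must be positive")
    return sum(intensity * area for intensity, area in rois) / total_area

# Hypothetical ROI measurements: (mean intensity, area in pixels).
rois = [(1200.0, 50.0), (800.0, 150.0)]
print(embryo_mean_intensity(rois))  # (1200*50 + 800*150) / 200 = 900.0
```

Weighting by area means that the result equals the mean intensity over all pixels inside the ROIs, so a few small, bright ROIs do not dominate the embryo-level value.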
The embryo mean intensity of mutant lines was compared to the wild-type in the same batch with a Wilcoxon test. In the case where two biological replicates from different batches showed inconsistency in expression changes (i.e. one was different from wild-type and the other was not), we took a conservative approach and removed both of them. A few svbp lines (57-4, 19-3, 64-14, 69-1, 89-13, 89-70, 90-7) were imaged along with two wild-type controls that were different from the control of other batches, and their intensities were scaled to the other control by linear conversion to eliminate differences caused by background variation in the different controls. The data were normalized to the control line in each batch when combined in one plot. Data from biological replicates were merged.
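Two of the bookkeeping steps above, normalizing intensities to the batch control and adjusting p-values for multiple comparisons, can be sketched as follows. This is a minimal illustration under stated assumptions: normalization is taken to be division by the mean of the control embryos in the same batch, and the FDR adjustment is taken to be the Benjamini-Hochberg procedure; the text does not specify either detail. All function names and numbers are hypothetical.

```python
from statistics import mean

def normalize_to_control(values, control_values):
    """Express intensities relative to the mean of the batch control.

    Assumption: the control summary is the mean; a median would also work.
    """
    ctrl = mean(control_values)
    return [v / ctrl for v in values]

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical values for illustration only.
normalized = normalize_to_control([2.0, 4.0], [1.0, 3.0])
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(normalized)
print(adjusted)
```

Normalizing within each batch before pooling keeps batch-to-batch staining differences from being mistaken for effects of the mutations.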
To compare expression level between DSCP and svbp, we extracted nucleus intensity by identifying local maxima with a prominence of 2000 and selecting a circular region with a radius of 0. 55 [...]

Figure 5. Model of enhancer and promoter evolution. Cell type-specific enhancers encode information for the location, levels and states of gene expression, whereas promoters encode information for the level of gene expression and integrate the transcriptional outputs from multiple enhancers. Promoters are relatively robust to mutations, allowing evolutionary changes through enhancers, including novel or co-opted changes.

(d) Allele replacement with CRISPR
We performed allele replacement following a two-step process, using a 3XP3-RFP marker as an intermediate step to easily select for integration events [58]. A 182 bp-long sequence of svb promoter immediately upstream of svb 5′ UTR was targeted with two gRNAs, 5′-cgagatattcgccgttgctc-3′ and 5′-gaatacagtaagttgcgagc-3′, which were cloned into pCFD4. A repair template containing the 3XP3-RFP sequence (1.86 kb) [58] and a 983 bp-long homology arm at each end was synthesized and cloned into pUC57. The gRNA (75 ng µl⁻¹) and the repair template (225 ng µl⁻¹) were mixed and injected into a fly stock expressing Cas9 in the germline (BDSC#51324: w[1118]; PBac{y[+mDint2] GFP[E.3xP3]=vas-Cas9}VK00027). Flies from the injection were crossed to an FM6 balancer and subsequently screened for RFP expression in the eyes, which indicates successful replacement of svb promoter by the 3XP3-RFP cassette. The RFP-positive transformants were then homozygosed for both RFP and GFP markers to establish a fly line for the second round of allele replacement.
In the second round, we replaced the 3XP3-RFP sequence with the DSCP sequence. The gRNAs were designed based on the fused sequence of svb locus and 3XP3-RFP cassette: 5′-GGTACCGTACGAGATCTCTC-3′ and 5′-GGCGCCTAAGGATCGATAGC-3′, cloned into pCFD4. The repair template contained a 255 bp-long DSCP sequence and the same homology arms as above. A mixture of plasmids carrying gRNAs and repair template was injected into the RFP-positive line mentioned above. Flies from the injection were crossed to an RFP/FM6 line and screened for loss of RFP. The resulting transformants were then homozygosed to establish a stable line of svbPromoterΔ::DSCP genotype. The integration was confirmed by PCR and sequencing. A list of fly strains used in this study is provided in electronic supplementary material, table S2.

(e) Fluorescent in situ hybridization
svb transcripts were detected with DIG-labelled probes of svb as per [59]. Fixed Drosophila embryos were mounted in ProLong Gold + DAPI mounting media (Molecular Probes, Eugene, OR) and imaged on a Zeiss LSM 880 confocal microscope with Fast Airyscan under a 63× objective (Carl Zeiss Microscopy, Jena, Germany). Inside nuclei with svb transcription sites, the centre of the transcription site was identified using the find maximum function of Fiji/ImageJ. A circle with a diameter of 12 pixels [0.85 µm, region of interest (ROI)] centred on the transcription site was then created. The integrated fluorescent intensity inside the ROI was then reported. The intensity presented in the figures is the per-pixel average intensity with the maximum readout of the sensor normalized to 255.
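The ROI measurement described above can be sketched in plain Python: sum the pixel values inside a circle centred on the transcription site, then report the per-pixel average rescaled so that the sensor maximum maps to 255. The function name, the 16-bit sensor maximum (65535) and the test image are assumptions made for illustration; the text does not state the bit depth. A 12-pixel diameter spanning 0.85 µm implies a pixel size of roughly 0.07 µm.

```python
def roi_intensity(image, cx, cy, diameter_px=12, sensor_max=65535):
    """Integrated and rescaled per-pixel intensity inside a circular ROI.

    image: 2D list of pixel values indexed as image[y][x].
    sensor_max: assumed maximum sensor readout (16-bit here).
    Returns (integrated_intensity, per_pixel_mean_scaled_to_255).
    """
    radius_sq = (diameter_px / 2.0) ** 2
    total, count = 0.0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            # Keep pixels whose centre lies within the circle.
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq:
                total += value
                count += 1
    if count == 0:
        raise ValueError("ROI contains no pixels")
    return total, (total / count) * 255.0 / sensor_max

# Hypothetical saturated 20 x 20 image for illustration only.
image = [[65535] * 20 for _ in range(20)]
integrated, scaled = roi_intensity(image, cx=10, cy=10)
print(integrated, scaled)
```

Reporting the per-pixel average rather than the integrated value makes measurements comparable even if the ROI is clipped by the image border.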

(f ) Cuticle preparation
Embryos from an overnight laying period were dechorionated with bleach and left in distilled water at room temperature for 24 h. After 24 h, the hatched larvae were transferred onto a glass slide and mounted in Hoyer's medium mixed with lactic acid (1 : 1). The slide was baked at 55°C for 2 days before being imaged with dark field microscopy.
Data accessibility. All processed data generated during this study are included in the manuscript and supporting files.
The data are provided in electronic supplementary material [60].